


Creators/Authors contains: "Ho, Ming-Feng"


  1. ABSTRACT

    We assemble the largest C IV absorption line catalogue to date, leveraging machine learning, specifically Gaussian processes, to remove the need for visual inspection when detecting C IV absorbers. The catalogue contains probabilities classifying the reliability of each absorption system within a quasar spectrum. Our training set was a sub-sample of DR7 spectra that had no detectable C IV absorption in a large visually inspected catalogue. We used Bayesian model selection to decide between our continuum model and our absorption-line models. Using a random hold-out sample of 1301 spectra from all of the 26 030 investigated spectra in the DR7 C IV catalogue, we validated our pipeline and obtained an 87 per cent classification performance score. We found good purity and completeness values, both $\sim 80$ per cent, when a probability of $\sim 95$ per cent is used as the threshold. Our pipeline obtained similar C IV redshifts and rest equivalent widths to our training set. Applying our algorithm to 185 425 selected quasar spectra from SDSS DR12, we produce a catalogue of 113 775 C IV doublets with at least 95 per cent confidence. Our catalogue provides maximum a posteriori values and credible intervals for C IV redshift, column density, and Doppler velocity dispersion. We detect C IV absorption systems with a redshift range of 1.37–5.1, including 33 systems with a redshift larger than 5 and 549 absorber systems with a rest equivalent width greater than 2 Å at more than 95 per cent confidence. Our catalogue can be used to investigate the physical properties of the circumgalactic and intergalactic media.

     
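The purity and completeness quoted in the first abstract depend on the probability threshold used to accept a detection. The following is a minimal sketch of how such values could be computed from per-system probabilities and reference labels. The function name `purity_completeness` and the toy arrays are hypothetical and are not taken from the authors' pipeline.

```python
import numpy as np

def purity_completeness(p_absorber, is_absorber, threshold=0.95):
    """Purity and completeness of detections at a probability threshold.

    p_absorber : model probability that each candidate is a real absorber
    is_absorber: reference labels (e.g. from a visually inspected catalogue)
    """
    p = np.asarray(p_absorber, dtype=float)
    truth = np.asarray(is_absorber, dtype=bool)
    detected = p >= threshold

    true_pos = np.sum(detected & truth)
    # Purity: fraction of accepted detections that are real.
    purity = true_pos / detected.sum() if detected.any() else float("nan")
    # Completeness: fraction of real absorbers that were accepted.
    completeness = true_pos / truth.sum() if truth.any() else float("nan")
    return purity, completeness

# Toy data (made up for illustration only).
p = [0.99, 0.97, 0.96, 0.50, 0.10, 0.98]
labels = [1, 1, 0, 1, 0, 1]
purity, completeness = purity_completeness(p, labels, threshold=0.95)
print(f"purity={purity:.2f}, completeness={completeness:.2f}")
```

Raising the threshold typically trades completeness for purity, which is why a single operating point is reported alongside the catalogue.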
  2. ABSTRACT

    We introduce MF-Box, an extended version of MFEmulator, designed as a fast surrogate for power spectra, trained using N-body simulation suites from various box sizes and particle loads. To demonstrate MF-Box’s effectiveness, we design simulation suites that include low-fidelity (LF) suites (L1 and L2) at 256 and $100\,\mathrm{Mpc}\,h^{-1}$, each with $128^3$ particles, and a high-fidelity (HF) suite with $512^3$ particles at $256\,\mathrm{Mpc}\,h^{-1}$, representing a higher particle load compared to the LF suites. MF-Box acts as a probabilistic resolution correction function, learning most of the cosmological dependencies from L1 and L2 simulations and rectifying resolution differences with just three HF simulations using a Gaussian process. MF-Box successfully emulates power spectra from our HF testing set with a relative error of $<3$ per cent up to $k \simeq 7\,h\,\mathrm{Mpc}^{-1}$ at $z \in [0, 3]$, while maintaining a cost similar to our previous multifidelity approach, which was accurate only up to $z = 1$. The addition of an extra LF node in a smaller box significantly improves emulation accuracy for MF-Box at $k > 2\,h\,\mathrm{Mpc}^{-1}$, increasing it by a factor of 10. We conduct an error analysis of MF-Box based on computational budget, providing guidance for optimizing budget allocation per fidelity node. Our proposed MF-Box enables future surveys to efficiently combine simulation suites of varying quality, effectively expanding the range of emulation capabilities while ensuring cost efficiency.

     
  3. Abstract

    From the formation mechanisms of stars and compact objects to nuclear physics, modern astronomy frequently leverages surveys to understand populations of objects to answer fundamental questions. The population of dark and isolated compact objects in the Galaxy contains critical information related to many of these topics, but is only practically accessible via gravitational microlensing. However, photometric microlensing observables are degenerate for different types of lenses, and one can seldom classify an event as involving either a compact object or stellar lens on its own. To address this difficulty, we apply a Bayesian framework that treats lens type probabilistically and jointly with a lens population model. This method allows lens population characteristics to be inferred despite intrinsic uncertainty in the lens class of any single event. We investigate this method’s effectiveness on a simulated ground-based photometric survey in the context of characterizing a hypothetical population of primordial black holes (PBHs) with an average mass of $30\,M_\odot$. On simulated data, our method outperforms current black hole (BH) lens identification pipelines and characterizes different subpopulations of lenses while jointly constraining the PBH contribution to dark matter to ≈25%. Key to robust inference, our method can marginalize over population model uncertainty. We find the lower mass cutoff for stellar origin BHs, a key observable in understanding the BH mass gap, particularly difficult to infer in our simulations. This work lays the foundation for cutting-edge PBH abundance constraints to be extracted from current photometric microlensing surveys.

     
  4. ABSTRACT

    In this work, we extend our recently developed multifidelity emulation technique to the simulated Lyman-α forest flux power spectrum. Multifidelity emulation allows interpolation of simulation outputs between cosmological parameters using many cheap low-fidelity simulations and a few expensive high-fidelity simulations. Using a test suite of small-box ($30\,\mathrm{Mpc}\,h^{-1}$) simulations, we show that multifidelity emulation is able to reproduce the Lyman-α forest flux power spectrum well, achieving an average accuracy of 0.8 per cent when compared to the test suite. We further show that it has a substantially increased accuracy over single-fidelity emulators, constructed using either the high- or low-fidelity simulations only. In particular, it allows the extension of an existing simulation suite to smaller scales and higher redshifts.

     
  5. Abstract We present methods for emulating the matter power spectrum by combining information from cosmological N-body simulations at different resolutions. An emulator allows estimation of simulation output by interpolating across the parameter space of a limited number of simulations. We present the first implementation in cosmology of multi-fidelity emulation, where many low-resolution simulations are combined with a few high-resolution simulations to achieve an increased emulation accuracy. The power spectrum’s dependence on cosmology is learned from the low-resolution simulations, which are in turn calibrated using high-resolution simulations. We show that our multi-fidelity emulator predicts high-fidelity counterparts to percent-level relative accuracy when using only 3 high-fidelity simulations and outperforms a single-fidelity emulator that uses 11 simulations, although we do not attempt to produce a converged emulator with high absolute accuracy. With a fixed number of high-fidelity training simulations, we show that our multi-fidelity emulator is ≃ 100 times better than a single-fidelity emulator at $k \leq 2\,h\,\mathrm{Mpc}^{-1}$, and ≃ 20 times better at $3 \leq k < 6.4\,h\,\mathrm{Mpc}^{-1}$. Multi-fidelity emulation is fast to train, using only a simple modification to standard Gaussian processes. Our proposed emulator shows a new way to predict non-linear scales by fusing simulations from different fidelities.
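The idea of learning the overall trend from many cheap simulations and correcting it with a few expensive ones can be illustrated with a toy one-dimensional example. The sketch below assumes a linear autoregressive form, high ≈ ρ × low + δ, which is one common way to build multi-fidelity Gaussian processes; the abstracts do not state the exact model, so this choice is an assumption. The functions `low_fidelity` and `high_fidelity`, the kernel hyperparameters, and the training locations are all hypothetical, and the hyperparameters are fixed rather than optimized.

```python
import numpy as np

def rbf(a, b, length, var=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_mean(x_train, y_train, x_test, length=0.3, jitter=1e-6):
    """Posterior mean of a zero-mean GP with fixed hyperparameters."""
    K = rbf(x_train, x_train, length) + jitter * np.eye(len(x_train))
    return rbf(x_test, x_train, length) @ np.linalg.solve(K, y_train)

# Hypothetical stand-ins for cheap and expensive simulation outputs.
def low_fidelity(x):
    return np.sin(2 * np.pi * x)

def high_fidelity(x):
    return 1.2 * np.sin(2 * np.pi * x) + 0.3 * x

x_lf = np.linspace(0.0, 1.0, 20)      # many low-fidelity runs
x_hf = np.array([0.1, 0.5, 0.9])      # only three high-fidelity runs
x_test = np.linspace(0.0, 1.0, 50)
y_lf, y_hf = low_fidelity(x_lf), high_fidelity(x_hf)

# Step 1: emulate the low-fidelity output everywhere.
lf_at_hf = gp_mean(x_lf, y_lf, x_hf)
lf_at_test = gp_mean(x_lf, y_lf, x_test)

# Step 2: least-squares scale factor between the two fidelities.
rho = (lf_at_hf @ y_hf) / (lf_at_hf @ lf_at_hf)

# Step 3: model the remaining discrepancy with a smooth GP.
residual = y_hf - rho * lf_at_hf
mf_pred = rho * lf_at_test + gp_mean(x_hf, residual, x_test, length=1.0)

# Baseline: a single-fidelity GP trained on the three HF points alone.
sf_pred = gp_mean(x_hf, y_hf, x_test)

truth = high_fidelity(x_test)
mf_err = np.max(np.abs(mf_pred - truth))
sf_err = np.max(np.abs(sf_pred - truth))
print(f"max abs error: multi-fidelity={mf_err:.3f}, single-fidelity={sf_err:.3f}")
```

In this toy setup the three high-fidelity points cannot resolve the oscillation on their own, while the low-fidelity emulator supplies its shape, so the combined prediction has the smaller error.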
  6. ABSTRACT We present a new catalogue of Damped Lyman-α absorbers from SDSS DR16Q, as well as new estimates of their statistical properties. Our estimates are computed with the Gaussian process models presented in Garnett et al., Ho, Bird & Garnett with an improved model for marginalizing uncertainty in the mean optical depth of each quasar. We compute the column density distribution function (CDDF) at $2 < z < 5$, the line density ($\mathrm{d}N/\mathrm{d}X$), and the neutral hydrogen density ($\Omega_{\rm DLA}$). Our Gaussian process model provides a posterior probability distribution of the number of DLAs per spectrum, thus allowing unbiased probabilistic predictions of the statistics of DLA populations even with the noisiest data. We measure a non-zero column density distribution function for $N_{\rm HI} < 3 \times 10^{22}\,\mathrm{cm}^{-2}$ with 95 per cent confidence limits, and $N_{\rm HI} \lesssim 10^{22}\,\mathrm{cm}^{-2}$ for spectra with signal-to-noise ratios $>4$. Our results for DLA line density and total hydrogen density are consistent with previous measurements. Despite a small bias due to the poorly measured blue edges of the spectra, we demonstrate that our new model can measure the DLA population statistics when the DLA is in the Lyman-β forest region. We verify our results are not sensitive to the signal-to-noise ratios and redshifts of the background quasars, although a residual correlation remains for detections from $z_{\rm QSO} < 2.5$, indicating some residual systematics when applying our models on very short spectra, where the SDSS spectral observing window only covers part of the Lyman-α forest.
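For reference, the three statistics named in this abstract are conventionally defined as follows in the DLA literature. The abstract itself does not spell them out, so the expressions below are the standard forms rather than a quotation from the paper. Here $F(N_{\rm HI})$ denotes the number of absorbers in a column density bin of width $\Delta N_{\rm HI}$ over an absorption path $\Delta X$, and $10^{20.3}\,\mathrm{cm}^{-2}$ is the usual DLA column density threshold.

```latex
\begin{align}
X(z) &= \int_0^z (1+z')^2 \, \frac{H_0}{H(z')} \, \mathrm{d}z', \\
f(N_{\rm HI}, X) &= \frac{F(N_{\rm HI})}{\Delta N_{\rm HI}\,\Delta X}, \\
\frac{\mathrm{d}N}{\mathrm{d}X} &= \int_{10^{20.3}}^{\infty} f(N_{\rm HI}, X)\,\mathrm{d}N_{\rm HI}, \\
\Omega_{\rm DLA} &= \frac{m_{\rm p} H_0}{c\,\rho_{\rm c}} \int_{10^{20.3}}^{\infty} N_{\rm HI}\, f(N_{\rm HI}, X)\,\mathrm{d}N_{\rm HI}.
\end{align}
```

With a posterior over the number of DLAs per spectrum, $F(N_{\rm HI})$ can be treated as an expected count rather than a hard tally, which is what allows probabilistic estimates from noisy spectra.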
  7. ABSTRACT We present a revised version of our automated technique using Gaussian processes (GPs) to detect damped Lyman α absorbers (DLAs) along quasar (QSO) sightlines. The main improvement is to allow our GP pipeline to detect multiple DLAs along a single sightline. Our DLA detections are regularized by an improved model for the absorption from the Lyman α forest that improves performance at high redshift. We also introduce a model for unresolved sub-DLAs that reduces misclassifications of absorbers without detectable damping wings. We compare our results to those of two different large-scale DLA catalogues and provide a catalogue of the processed results of our GP pipeline using 158 825 Lyman α spectra from SDSS data release 12. We present updated estimates for the statistical properties of DLAs, including the column density distribution function, line density ($\mathrm{d}N/\mathrm{d}X$), and neutral hydrogen density ($\Omega_{\rm DLA}$).
  8. ABSTRACT We develop an automated technique to measure quasar redshifts in the Baryon Oscillation Spectroscopic Survey of the Sloan Digital Sky Survey (SDSS). Our technique is an extension of an earlier Gaussian process method for detecting damped Lyman α absorbers (DLAs) in quasar spectra with known redshifts. We apply this technique to a subsample of SDSS DR12 with BAL quasars removed and redshift larger than 2.15. We show that we are broadly competitive with existing quasar redshift estimators, disagreeing with the PCA redshift by more than 0.5 in only 0.38 per cent of spectra. Our method produces a probabilistic density function for the quasar redshift, allowing quasar redshift uncertainty to be propagated to downstream users. We apply this method to detecting DLAs, accounting in a Bayesian fashion for redshift uncertainty. Compared to our earlier method with a known quasar redshift, we have a moderate decrease in our ability to detect DLAs, predominantly in the noisiest spectra. The area under curve drops from 0.96 to 0.91. Our code is publicly available.
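Accounting for redshift uncertainty "in a Bayesian fashion" amounts to averaging the DLA probability over the redshift posterior instead of conditioning on a single value. A minimal sketch on a discrete redshift grid follows. The function name `marginal_dla_probability`, the grid, and all numbers are hypothetical and serve only to show the arithmetic; they are not taken from the authors' code.

```python
import numpy as np

def marginal_dla_probability(log_post_z, p_dla_given_z):
    """Average p(DLA | z, data) over a discrete posterior p(z | data).

    log_post_z   : unnormalized log posterior of the quasar redshift on a grid
    p_dla_given_z: DLA probability evaluated at each grid redshift
    """
    log_post_z = np.asarray(log_post_z, dtype=float)
    p_dla_given_z = np.asarray(p_dla_given_z, dtype=float)
    # Normalize in a numerically stable way (subtract the max before exp).
    weights = np.exp(log_post_z - log_post_z.max())
    weights /= weights.sum()
    return float(np.sum(weights * p_dla_given_z)), weights

# Toy grid (made up for illustration only).
z_grid = np.array([2.2, 2.3, 2.4])
log_post = np.log([0.2, 0.7, 0.1])
p_dla = np.array([0.9, 0.8, 0.1])

p_marg, weights = marginal_dla_probability(log_post, p_dla)
z_map = z_grid[np.argmax(weights)]
print(f"MAP redshift={z_map}, marginal DLA probability={p_marg:.2f}")
```

Conditioning on the MAP redshift alone would report 0.8 here, while the marginal value also reflects the less likely redshifts, which matters most for noisy spectra with broad redshift posteriors.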